1 Introduction



Imagine a spatial branching process in which the child of an individual either is a homebody, 
i.e. remains at the same site as its parent, or migrates to a new location which has never 
been occupied before and then founds its own colony. We assume that the reproduction law 
is the same for homebodies and migrants and does not depend on the spatial location either, so
this is essentially a discrete version of the Virgin Island Model of Hutzenthaler [12] when local 
competition between individuals is discarded; see also [13] and references therein. 

The dynamics of the process are entirely determined by the pair $(\xi^{h}, \xi^{m})$ of integer-valued
random variables giving the number of homebody children and the number of migrant children
of a typical individual. The special case where each child chooses to emigrate with a fixed
probability $p \in (0, 1)$, independently of the other children, can be interpreted in the framework
of the infinite-sites model in population genetics by identifying a spatial location with a
locus on a chromosome and p with the rate of neutral mutations. This setting has motivated a
number of works in the literature; see in particular Griffiths and Pakes [10] and Taïb [19]. In a
quite different direction, we may also consider for instance the cut-off situation where there is
a threshold k such that the first k children of an individual are always homebodies while the
next children (if any) are forced to migrate. We may think of many other simple rules as there
are no assumptions on the correlation between $\xi^{h}$ and $\xi^{m}$.
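The two rules just mentioned can be sketched in a few lines of Python. This is only an illustration: the function names `split_binomial` and `split_cutoff` are hypothetical, and the total number of children is taken as given rather than drawn from a reproduction law.

```python
import random

def split_binomial(total, p, rng):
    """Each of `total` children migrates independently with probability p."""
    migrants = sum(1 for _ in range(total) if rng.random() < p)
    return total - migrants, migrants      # (homebodies, migrants)

def split_cutoff(total, k):
    """The first k children are homebodies; any further children migrate."""
    homebodies = min(total, k)
    return homebodies, total - homebodies  # (homebodies, migrants)

rng = random.Random(0)
print(split_binomial(10, 0.2, rng))
print(split_cutoff(5, 3))
```

In the first rule the two counts are conditionally binomial given the total, whereas in the second the number of migrants is a deterministic function of the total, which illustrates how different the dependence between the two coordinates can be.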

We are interested in statistics of the decomposition of the entire population according to the 
locations of individuals, which we call the partition into colonies. This partition is naturally 
endowed with a genealogical tree structure which has been described in [4] . The recent work [5] 
focussed on the special case where neutral mutations occur in a Galton-Watson process with a
fixed reproduction law which is critical and has a finite variance. In asymptotic regimes where
the population is large and the mutation rate small, we established a weak limit theorem for
the tree of alleles, that is, in the present setting, the partition into colonies equipped with its
genealogical structure. The limit was described in terms of the genealogical tree of a continuous
state branching process in discrete time (cf. Jiřina [14]) with an inverse Gaussian reproduction
measure. In a related direction, we also point at recent work by Abraham et al. [2] on pruning
Lévy continuum random trees.

In the present paper, we shall investigate more generally asymptotics of the partition into 
colonies (ignoring its genealogical structure) for branching processes with emigration when 
populations are large and migrations rare. The regimes of interest are related to the well-
known limit theorems for rescaled Galton-Watson processes towards continuous state branching
processes in continuous time. Our main result (Theorem 2) states that after an appropriate
rescaling, the partition into colonies converges weakly to some random point measure. The 
latter is constructed from the restriction of a Poisson point measure to a certain random region. 
An important step in our analysis is that, although in general the cumulant of this limiting 
random measure is not explicitly known, it can be characterized as the unique solution to a 
rather simple integral equation. 

Let us briefly present the plan of this work by explaining our approach. In Section 2, we point 
at the fact that the cumulant of the partition into colonies solves a certain integral equation. 
This equation stems from the extended branching property that is fulfilled by the partition, 
and is given in terms of the distribution of a pair of random variables which arise naturally in 
this setting. We also recall a useful identity in law which relates the preceding variables to
passage times in certain random walks.

In Section 3, we consider a sequence of branching processes with emigration and introduce 
the basic assumptions. These are closely related to the classical limit theorems for rescaled 
Galton-Watson processes and involve Lévy processes with no negative jumps. Motivated by
Section 2, we investigate limits in distribution for passage times of random walks, and point at
the role of the Lévy measure of a bivariate subordinator which arises in this setting.

In Section 4, we introduce a family of random point measures which are constructed from 
Poisson point measures on a product space by restriction to certain random domains. The 
main feature is that the cumulant of such a random measure can be characterized as the 
unique solution to another integral equation involving the intensity measure of the underlying 
Poisson measure.

Our main result for limits in law of partitions into colonies is presented and proved in 
Section 5. Roughly, we show that the cumulants of the partitions into colonies of a sequence 
of branching processes with emigration converge after an appropriate rescaling to the unique 
solution of an equation of the type which appeared in Section 4. More precisely, it corresponds 
to the case where the intensity of the driving Poisson measure is given by the Lévy measure
that has arisen in Section 3. 

Finally, Section 6 is devoted to a few (hopefully) interesting examples, partly to demonstrate 
the variety of possible asymptotic behaviors. Roughly speaking, the common feature in these 
examples is that the Galton-Watson process for which spatial locations of individuals are ignored
has a fixed distribution. We shall consider different natural possibilities for selecting migrant
children amongst the progeny of an individual, which will yield different limiting partitions into
colonies. In the case corresponding to rare neutral mutations in the infinite alleles branching
process, the limiting random partition can be described in terms of certain Poisson-Kingman
partitions which have been considered by Pitman [17].

2 Preliminaries on partitions into colonies 

In this section, we briefly introduce notation and present some basic properties for Galton-
Watson processes with emigration and the induced partitions into colonies. The material is
essentially adapted from [I] and [5] to which we refer for details, with the exception of Lemma 
[1] which is new. 

Roughly speaking, we consider a spatial haploid population model with discrete non-overlapping
generations where each individual begets independently of the others, according to a fixed
reproduction law which is independent of the location of that individual. We do not specify
geometrical details of the space where individuals live as this would be irrelevant for the study;
the only implicit assumption is that this space is infinite. A child can either stay at the same
site as its parent or migrate to a new site which has never been occupied before and then found
its own colony. This child is called a homebody in the first case, and a migrant in the second.
For the sake of simplicity, we shall assume in this work that at the initial time each ancestor
lives in a different location, although arbitrary initial conditions could be dealt with more
generally. The law of this model is thus entirely determined by the number of ancestors and
a pair of integer-valued random variables $(\xi^{h}, \xi^{m})$ which should be thought of as the number
of homebody children and the number of migrant children of a typical individual. For every
$a \in \mathbb{N}$, we use the notation $\mathbb{P}_a$ for the probability measure under which this model starts from
a ancestors.
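The model just described can be simulated directly. The sketch below is an illustration only: the function name `simulate_colonies`, the safety cap `max_individuals`, and the particular sub-critical offspring law used in the demonstration are assumptions made for the example, not part of the paper.

```python
import random
from collections import deque

def simulate_colonies(offspring, ancestors, max_individuals=10**6):
    """Return the list of colony sizes; offspring() returns (homebodies, migrants)."""
    sizes = [1] * ancestors            # each ancestor starts its own colony
    queue = deque(range(ancestors))    # colony index of every individual yet to reproduce
    processed = 0
    while queue:
        colony = queue.popleft()
        processed += 1
        if processed > max_individuals:
            raise RuntimeError("population too large; is the law (sub-)critical?")
        homebodies, migrants = offspring()
        sizes[colony] += homebodies    # homebodies stay at the parent's site
        queue.extend([colony] * homebodies)
        for _ in range(migrants):      # each migrant founds a brand new colony
            sizes.append(1)
            queue.append(len(sizes) - 1)
    return sizes

rng = random.Random(1)

def subcritical_offspring():
    # Hypothetical law: 0, 1 or 2 children (mean 0.7); each child migrates with probability 0.1.
    total = rng.choices([0, 1, 2], weights=[0.5, 0.3, 0.2])[0]
    migrants = sum(rng.random() < 0.1 for _ in range(total))
    return total - migrants, migrants

sizes = simulate_colonies(subcritical_offspring, ancestors=5)
print(len(sizes), sum(sizes))  # number of colonies, total population
```

The order in which individuals reproduce is irrelevant for the law of the colony sizes, since all individuals use independent copies of the same offspring pair; a queue is used only for convenience.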

If spatial locations are discarded, then the total number of individuals per generation clearly
forms a standard Galton-Watson process with reproduction law given by the distribution of
$\xi = \xi^{h} + \xi^{m}$. We always assume that this Galton-Watson process is critical or sub-critical, viz.
$\mathbb{E}(\xi) \le 1$, and implicitly exclude the degenerate case where $\xi \equiv 1$, so the population becomes
eventually extinct a.s. The main object of interest in this work is the partition into colonies,
which we represent as a random discrete measure

$$\mathcal{P} = \sum_{j=1}^{\gamma} \delta_{C_j}.$$

Here $\gamma$ is the total number of colonies (that is, occupied sites) and $C_j$ denotes the total number
of individuals that lived at the j-th colony. Observe that the first moment of $\mathcal{P}$ coincides with
the total population $\zeta$ of the Galton-Watson process, viz. under $\mathbb{P}_a$

$$\langle \mathcal{P}, \mathrm{Id} \rangle = \sum_{j=1}^{\gamma} C_j = a + \sum_{k=1}^{\zeta} \xi_k,$$

where $\xi_k = \xi^{h}_k + \xi^{m}_k$ stands for the number of children of the k-th individual for some enumeration
procedure, and that the mass of $\mathcal{P}$ is just the number of colonies

$$\gamma = a + \sum_{k=1}^{\zeta} \xi^{m}_k.$$

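We denote the cumulant of the partition into colonies when there is a single ancestor by

$$K(f) = -\ln \mathbb{E}_1\big(\exp(-\langle \mathcal{P}, f \rangle)\big),$$

where $f : \mathbb{N} \to \mathbb{R}_+$ stands for a generic function and

$$\langle \mathcal{P}, f \rangle = \sum_{j=1}^{\gamma} f(C_j).$$

We also point out from the branching property that for an arbitrary number of ancestors $a \in \mathbb{N}$
we have

$$\mathbb{E}_a\big(\exp(-\langle \mathcal{P}, f \rangle)\big) = \exp(-aK(f)),$$

hence the cumulant K characterizes the law of $\mathcal{P}$ under $\mathbb{P}_a$ for any $a \ge 1$.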

The starting point of our analysis relies on the fact that this cumulant is determined in 
terms of the distribution of a pair of random variables which appear naturally in the branching 
process with emigration. Specifically, imagine for a while a variation of the model starting from 
a single ancestor in which migrants are sterilized (i.e. they have no offspring). We denote by 
C the total number of individuals that lived at the same site as the ancestor and by M the 
number of sterilized migrant children. In other words, C is the size of the colony generated by 
the ancestor and M the number of colonies which have been founded by migrant children of 
the ancestral colony. 

Lemma 1 For every function $f : \mathbb{N} \to \mathbb{R}_+$, the cumulant $K(f)$ of $\mathcal{P}$ is the unique solution
$\lambda \ge 0$ to the equation

$$e^{-\lambda} = \mathbb{E}_1\big(\exp(-f(C) - \lambda M)\big).$$
Proof: This stems from the branching property which is inherited by the partition into
colonies. More precisely, we work under $\mathbb{P}_1$ and decompose the total population into the ancestral
colony and the families generated by the migrant children of that colony. Because the descent
of each migrant child has the same distribution as the initial spatial Galton-Watson process,
independently of the other migrant children and of the homebody offspring of the ancestor, this
yields

$$\begin{aligned}
\exp(-K(f)) &= \mathbb{E}_1\big(\exp(-\langle \mathcal{P}, f \rangle)\big) \\
&= \mathbb{E}_1\big(\exp(-f(C))\, \mathbb{E}_M(\exp(-\langle \mathcal{P}, f \rangle))\big) \\
&= \mathbb{E}_1\big(\exp(-f(C) - K(f)M)\big).
\end{aligned}$$

We refer to Chauvin [6] for a rigorous formulation of the extended branching property of Galton-
Watson processes at stopping lines that we have used above, and also to [I] for an alternative
argument based on the strong Markov property of random walks.

Uniqueness of the solution follows from the following observation. Suppose first that $f(C) \not\equiv 0$.
By Hölder's inequality, the map

$$\lambda \mapsto \lambda + \ln \mathbb{E}_1\big(\exp(-f(C) - \lambda M)\big)$$

is convex and its value at $\lambda = 0$ is negative. Hence it can take the value 0 for a single value of
$\lambda > 0$ at most. When $f(C) \equiv 0$, the equation reduces to

$$e^{-\lambda} = \mathbb{E}_1\big(\exp(-\lambda M)\big).$$

Recall that the Galton-Watson process is critical or sub-critical, so $\mathbb{E}_1(M) \le 1$ according to
Corollary 1 of [5]. It is well-known that this ensures uniqueness of the solution to the preceding
equation. □

Lemma 1 provides an implicit characterization of the law of the partition into colonies
through that of the pair of random variables (C, M). In turn, the latter can be conveniently
described in terms of a pair of random walks. This has its root in a key observation for Galton-
Watson processes that goes back to Harris [11], and will have an important role here for the
analysis of asymptotic behaviors. Specifically, consider

$$S^{h}_k = \xi^{h}_1 + \cdots + \xi^{h}_k - k \quad \text{and} \quad S^{m}_k = \xi^{m}_1 + \cdots + \xi^{m}_k, \qquad k \in \mathbb{Z}_+,$$

where the pairs $(\xi^{h}_k, \xi^{m}_k)$, $k \ge 1$, are i.i.d. copies of $(\xi^{h}, \xi^{m})$.
Next define for every integer $j \ge 0$ the first passage time

$$\tau_j = \inf\{k : S^{h}_k = -j\}.$$

We lift the following useful identity from Lemma 3 in [5]:

Lemma 2 The pair $(\tau_1, S^{m}_{\tau_1})$ has the same law as (C, M).

We refer to Theorem 1(ii) in [4] or to Proposition 1 in [5] for an explicit formula for this
distribution which is obtained by a combinatorial argument and extends the well-known result
of Dwass and Kemperman [8] for the total population of Galton-Watson processes and passage
times of downward skip free random walks. In this direction we also mention that the sequence
of the atoms of the partition into colonies has the same distribution under $\mathbb{P}_a$ as

$$(\tau_j - \tau_{j-1} : 1 \le j \le \eta_a), \quad \text{with } \eta_a = \inf\{j : j - S^{m}_{\tau_j} = a\}. \qquad (1)$$

This follows from Section 2 in [4]; see in particular Lemma 4 there. The interested reader
may wish to provide an alternative proof of Lemma 1 based on this representation and using
the strong Markov property for random walks in place of the extended branching property for
Galton-Watson processes.
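The random walk representation of the atoms can be read off mechanically from a sequence of offspring pairs. The following Python sketch is an illustration under stated assumptions: the function name `colonies_from_walk` is hypothetical, and the input is a finite list of (homebody, migrant) pairs, so an error is raised if the list runs out before the stopping index is reached.

```python
def colonies_from_walk(pairs, ancestors):
    """Colony sizes read off a sequence of (homebody, migrant) offspring pairs.

    The walk s_h makes steps xi_h - 1 and s_m makes steps xi_m.  The j-th atom
    is the gap between successive first passages of s_h at levels -(j-1) and -j,
    and we stop at the first j with j - s_m (evaluated at that passage) == ancestors.
    """
    sizes = []
    s_h = 0        # homebody walk, downward skip free
    s_m = 0        # migrant walk, non-decreasing
    last_passage = 0
    for k, (xi_h, xi_m) in enumerate(pairs, start=1):
        s_h += xi_h - 1
        s_m += xi_m
        if s_h == -(len(sizes) + 1):          # first passage at the next level
            sizes.append(k - last_passage)
            last_passage = k
            if len(sizes) - s_m == ancestors:  # stopping index reached
                return sizes
    raise ValueError("sequence of pairs exhausted before the stopping index")

print(colonies_from_walk([(1, 1), (0, 0), (0, 0)], ancestors=1))
```

In the example, the first passage at level -1 occurs at step 2 with one migrant seen so far, which corresponds to an ancestral colony of size 2 founding one further colony, in line with Lemma 2.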



3 Random walks, Lévy processes, and passage times

As our main goal is to investigate limits of partitions into colonies, Lemmas 1 and 2 suggest
that we should study asymptotics of first passage times in random walks, which is the purpose
of this section. We first introduce the asymptotic regimes that we shall consider later on,
and develop some of their consequences for passage times of certain random walks and Lévy
processes. Our starting point is a classical result of convergence for rescaled Galton-Watson
processes towards continuous state branching processes (in short, CSBP) that we now recall.
We refer to the monograph [7] by Duquesne and Le Gall for a complete account, including some
terminology which will not be defined here.

We consider for each integer n a sequence $(\xi_{n,k} : k \in \mathbb{N})$ of i.i.d. copies of some integer-valued
random variable with mean at most 1 which should be thought of as the number of children of
a typical individual in the n-th population model. The basic assumption is that there exists a
sequence $a(n)$ with $\lim_{n\to\infty} a(n)/n = \infty$ such that

$$\frac{1}{n}\left(\xi_{n,1} + \cdots + \xi_{n,[a(n)t]} - [a(n)t]\right) \Longrightarrow X_t \qquad (2)$$

for some (and then all) $t > 0$, where the notation $\Longrightarrow$ refers to convergence in distribution as
$n \to \infty$. More precisely, (2) can then be reinforced to weak convergence on the space of càdlàg
processes endowed with Skorohod's topology; see for instance Theorem 16.4 in [15]. Moreover,
the limit $X = (X_t : t \ge 0)$ is necessarily a Lévy process which has no negative jumps and does
not drift to $+\infty$, that is $\mathbb{E}(X_t) \in [-\infty, 0]$ (this follows from the requirement that $\mathbb{E}(\xi_{n,k}) \le 1$
for every n).

The law of the Lévy process X is characterized by its Laplace exponent $\psi : \mathbb{R}_+ \to \mathbb{R}_+$ which
is defined by

$$\mathbb{E}(\exp(-qX_t)) = \exp(t\psi(q)), \qquad q \ge 0.$$

We shall further assume that X has infinite variation, or equivalently that

$$\lim_{q\to\infty} q^{-1}\psi(q) = \infty.$$

Next, consider a sequence $(a(n) : n \in \mathbb{N})$ with $a(n)/n \to a$ for some $a > 0$, and denote by
$Z^{(n)}$ a Galton-Watson process started from $a(n)$ ancestors and with reproduction law given by the
distribution of $\xi_{n,k}$. Then we have

£ y{ri) . 7 
where $(Z_t : t \ge 0)$ is a CSBP started from $Z_0 = a$ and with branching mechanism $\psi$; see e.g.
Theorem 2.1.1 in [7].

We now turn our attention to the spatial case where some children of a parent may emigrate.
That is we consider an array $((\xi^{h}_{n,k}, \xi^{m}_{n,k}) : k, n \in \mathbb{N})$ of random variables with values in $\mathbb{Z}_+^2$, where
$\xi^{h}_{n,k}$ should be thought of as the number of homebody children and $\xi^{m}_{n,k}$ as the number of migrant
children of the k-th individual for the n-th population model; in particular $\xi_{n,k} = \xi^{h}_{n,k} + \xi^{m}_{n,k}$.
We assume that for each fixed integer n, the sequence $((\xi^{h}_{n,k}, \xi^{m}_{n,k}) : k \in \mathbb{N})$ is i.i.d., and just as
in the preceding section, we construct a pair of random walks

$$S^{h}_{n,k} = \xi^{h}_{n,1} + \cdots + \xi^{h}_{n,k} - k \quad \text{and} \quad S^{m}_{n,k} = \xi^{m}_{n,1} + \cdots + \xi^{m}_{n,k}, \qquad k \in \mathbb{Z}_+.$$

Observe that the random walk $S^{m}_{n,\cdot}$ is non-decreasing while $S^{h}_{n,\cdot}$ is downward skip free, i.e. its
increments belong to $\{-1, 0, 1, 2, \ldots\}$. We now reinforce (2) by assuming that the Lévy process
X can be decomposed as a sum

$$X_t = X^{h}_t + X^{m}_t,$$

where $((X^{h}_t, X^{m}_t) : t \ge 0)$ is a bivariate Lévy process, in such a way that for some (and then
all) $t > 0$

$$\left(\frac{1}{n} S^{h}_{n,[a(n)t]}, \frac{1}{n} S^{m}_{n,[a(n)t]}\right) \Longrightarrow (X^{h}_t, X^{m}_t). \qquad (3)$$

We point out that again (3) is automatically reinforced to weak convergence in the sense of
Skorohod by an appeal to Theorem 16.4 in [15].

We stress that necessarily, the Lévy process $X^{h}$ has no negative jumps, infinite variation, and
does not drift to $+\infty$, and that $X^{m}$ must be a subordinator (i.e. an increasing Lévy process);
the two may or may not be correlated. We denote the bivariate Laplace exponent by $\Psi$, that is

$$\mathbb{E}\big(\exp(-(qX^{h}_t + rX^{m}_t))\big) = \exp(t\Psi(q, r)), \qquad q, r \ge 0.$$

In particular, there is the identity

$$\psi(q) = \Psi(q, q), \qquad q \ge 0;$$

note also that our assumptions force $\Psi(q, q) \ge 0$ whereas $\Psi(0, r) \le 0$.
Next, we consider the first passage process

$$T_x = \inf\{t \ge 0 : X^{h}_t < -x\}, \qquad x \ge 0$$

which is a subordinator whose Laplace exponent is given by the inverse function of $\Psi(\cdot, 0)$; see
Theorem VII.1 in [3]. Using $T_\cdot$ as a time-substitution, we also introduce the compound process

$$Y_x = X^{m}_{T_x}, \qquad x \ge 0.$$

The distribution of the pair (T, Y) can be described as follows.

Lemma 3 (i) The process

$$((T_x, Y_x) : x \ge 0)$$

is a bivariate subordinator.

(ii) Its Laplace exponent $\Phi : \mathbb{R}_+^2 \to \mathbb{R}_+$ defined by

$$\mathbb{E}(\exp(-qT_x - rY_x)) = \exp(-x\Phi(q, r)), \qquad q, r \ge 0$$

is determined as the unique solution to the equation

$$\Phi(\Psi(q, r), r) = q.$$

(iii) There exists a unique measure $\Lambda$ on $\mathbb{R}_+^2 \setminus \{(0,0)\}$ with $\int (1 \wedge (x_1 + x_2))\,\Lambda(dx_1\,dx_2) < \infty$ such
that

$$\Phi(q, r) = \int_{\mathbb{R}_+^2} \left(1 - e^{-qx_1 - rx_2}\right) \Lambda(dx_1\,dx_2).$$

In other words, the bivariate subordinator $(T_x, Y_x)$ has no drift and Lévy measure $\Lambda$.

(iv) Finally, we also have

$$\mathbb{E}(Y_1) = \int_{\mathbb{R}_+^2} x_2\,\Lambda(dx_1\,dx_2) \le 1.$$

Proof: The proof is essentially a variation of that of Theorem VII.1 in [3]. The passage times
$T_x$ are stopping times in the natural filtration of the bivariate Lévy process $(X^{h}, X^{m})$ which
are a.s. finite and such that $X^{h}_{T_x} = -x$ (by the absence of negative jumps for $X^{h}$ and the
fact that $X^{h}$ does not drift to $+\infty$). The strong Markov property immediately implies that
$(T_x, Y_x)$ has independent and stationary increments; further this process clearly has càdlàg
non-decreasing sample paths in each coordinate. In other words, it is a bivariate subordinator.

The Laplace exponent $\Phi$ is then determined by an application of Doob's sampling theorem
to the martingale

$$\exp\big(-(qX^{h}_t + rX^{m}_t) - t\Psi(q, r)\big), \qquad t \ge 0$$

(recall that $X^{h}_{T_x} = -x$ a.s.). Observe that our assumptions ensure that for every $r \ge 0$, the
function $\Psi(\cdot, r)$ is continuous and convex with $\Psi(0, r) \le 0$ and $\Psi(\infty, r) = \infty$, so the equation
$\Phi(\Psi(q, r), r) = q$ determines $\Phi$ on $\mathbb{R}_+^2$.

It remains to check that both subordinators have no drift. We know from Corollary VII.5 in
[3] and the fact that $X^{h}$ has unbounded variation that $\lim_{q\to\infty} \Psi(q, 0)/q = \infty$. This implies that
$\lim_{q\to\infty} \Phi(q, 0)/q = 0$ and hence $T_\cdot$ has no drift. On the other hand, the Lévy-Itô decomposition
enables us to express the subordinator $X^{m}$ as the sum of a linear drift and a pure-jump process.
The time-substitution by $T_x$ thus yields that $Y_x$ can be expressed as the sum of two pure-jump
processes, and hence its drift coefficient must be zero.

The penultimate displayed identity of the statement is just the celebrated Lévy-Khintchine
formula. Finally, the assumption that the Lévy process X does not drift to $+\infty$ is equivalent
to requiring that its first moment exists and is non-positive, $\mathbb{E}(X_t) \le 0$. It follows that
$X_t = X^{h}_t + X^{m}_t$ is a super-martingale, and since $T_x$ is a stopping time, we deduce from Doob's
sampling theorem that for every $x, t \ge 0$

$$\mathbb{E}\big(X^{m}_{t \wedge T_x}\big) \le \mathbb{E}\big(-X^{h}_{t \wedge T_x}\big) \le x,$$

where the second inequality is due to the definition of $T_x$ and the absence of negative jumps for
$X^{h}$. Then it suffices to let $t \to \infty$ to get by monotone convergence that $\mathbb{E}(Y_x) \le x$, which in
turn yields our last claim by an application of the Lévy-Itô decomposition of the subordinator
Y and the first-moment formula for Poisson measures. □
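As a numerical illustration of part (ii), the exponent of the pair (T, Y) can be obtained by inverting the bivariate exponent of the Lévy process in its first variable. The sketch below relies on an example that is an assumption of this illustration and not taken from the text: the homebody part is a standard Brownian motion with drift -c and the migrant part is the deterministic drift ct, for which the bivariate exponent is q^2/2 + cq - cr and the inverse has the closed form sqrt(c^2 + 2 theta + 2cr) - c. The names `psi_brownian` and `phi` are hypothetical.

```python
import math

def psi_brownian(q, r, c=1.0):
    """Bivariate exponent for X^h_t = B_t - c t and X^m_t = c t (assumed example)."""
    return 0.5 * q * q + c * q - c * r

def phi(theta, r, psi_fn, tol=1e-12):
    """Root q >= 0 of psi_fn(q, r) = theta, located by bisection.

    Uses only that psi_fn(., r) is continuous, convex, non-positive at 0
    and tends to infinity, so the root on [0, infinity) is unique for theta >= 0.
    """
    lo, hi = 0.0, 1.0
    while psi_fn(hi, r) < theta:     # enlarge the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if psi_fn(mid, r) < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta, r = 0.7, 0.3
print(phi(theta, r, psi_brownian), math.sqrt(1.0 + 2 * theta + 2 * r) - 1.0)
```

In this example the derivative in r of the closed form at the origin equals 1, which is consistent with the bound in part (iv).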

Lemmas 1 and 2 suggest that the asymptotic behavior of the distribution of the partition
into colonies should be related to that of the first passage times of the downward skip free
random walk $S^{h}_{n,\cdot}$,

$$\tau_{n,j} = \inf\{k : S^{h}_{n,k} = -j\}, \qquad j \in \mathbb{N}.$$

In this direction, we point at the following limit theorem.

Corollary 1 In the regime (3), we have for every bounded continuous function $g : \mathbb{R}_+^2 \to \mathbb{R}$
with $g(x_1, x_2) = O(x_1 + x_2)$ as $x_1 + x_2 \to 0$ that

$$\lim_{n\to\infty} n\,\mathbb{E}\Big(g\big(a(n)^{-1}\tau_{n,1},\, n^{-1}S^{m}_{n,\tau_{n,1}}\big)\Big) = \int_{\mathbb{R}_+^2} g(x_1, x_2)\,\Lambda(dx_1\,dx_2),$$

where the Lévy measure $\Lambda$ has been defined in Lemma 3.

Proof: It follows from the assumptions (3) and routine arguments (recall that $S^{h}_{n,\cdot}$ is downward
skip free and that (3) can be reinforced to weak convergence of càdlàg processes) that
for an arbitrary $x \ge 0$

$$\big(a(n)^{-1}\tau_{n,[nx]},\, n^{-1}S^{m}_{n,\tau_{n,[nx]}}\big) \Longrightarrow (T_x, Y_x). \qquad (4)$$

On the other hand, one readily deduces from the strong Markov property for random walks
that for each fixed n,

$$\big(\tau_{n,k},\, S^{m}_{n,\tau_{n,k}}\big), \qquad k \ge 0$$

is a random walk with non-decreasing coordinates. We complete the proof by an appeal to
Corollary 15.16 in Kallenberg [15] and Lemma 3. □



4 A family of random point measures 

In this section, we introduce and develop some properties of a class of random point measures
which will arise later on as limits for partitions into colonies. The idea stems from the
representation (1) of the sequence of the atoms of the partition into colonies. Indeed, as by the strong
Markov property, the increments $\tau_j - \tau_{j-1}$ of the first passage time process in a downward skip
free random walk are i.i.d., the combination of (1) and the law of rare events suggests that if
a limiting partition exists, then it should be described in terms of a Poisson random measure
restricted to a random domain with a boundary given by a first passage time.

Our basic analytic datum is some sigma-finite measure on $\mathbb{R}_+^2$ with no mass at (0, 0) that
will be denoted by $\Lambda$. Although in subsequent sections $\Lambda$ will be chosen to be the Lévy measure
that arises in Lemma 3, this specification is not required in the present section (of course,
the notation introduced here is coherent with that of Section 3). We write $\Lambda^{(1)}$ and $\Lambda^{(2)}$ for the
restrictions to $(0, \infty)$ of the two marginals of $\Lambda$ and assume that $\Lambda^{(1)}$ is sigma-finite and

$$\int_{(0,\infty)} x\,\Lambda^{(2)}(dx) \le 1. \qquad (5)$$



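We consider a Poisson measure $\mathcal{N}$ on $(0, \infty) \times \mathbb{R}_+^2$ with intensity measure $dt \otimes \Lambda(dx_1\,dx_2)$,
and denote by $(t, \Delta_t) = (t, \Delta^{(1)}_t, \Delta^{(2)}_t)$ a generic atom of $\mathcal{N}$. Following the classical construction
of Lévy and Itô, we introduce the subordinator

$$Y_t = \int_{[0,t] \times \mathbb{R}_+^2} x_2\,\mathcal{N}(ds\,dx_1\,dx_2) = \sum_{0 < s \le t} \Delta^{(2)}_s, \qquad t \ge 0.$$

Because Y has no drift and its Lévy measure $\Lambda^{(2)}$ fulfills (5), we have $\mathbb{E}(Y_1) \le 1$. In particular,
the first passage times

$$\sigma_y = \inf\{t \ge 0 : t - Y_t = y\}, \qquad y \ge 0$$

are finite a.s.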

Next, for every $t \ge 0$, we consider the point measure on $(0, \infty)$

$$\mathcal{N}^{(1)}_t = \sum_{0 < s \le t} \delta_{\Delta^{(1)}_s}.$$

Note that $\mathcal{N}^{(1)}_t$ is a Poisson random measure with intensity $t\Lambda^{(1)}$, and from the superposition
property of Poisson measures, that the measure-valued process $(\mathcal{N}^{(1)}_t : t \ge 0)$ has independent
and stationary increments. The random point measures we are interested in are defined by
time-substitution through the passage times $\sigma_y$

$$\mathcal{M}_y = \mathcal{N}^{(1)}_{\sigma_y} = \sum_{0 < t \le \sigma_y} \delta_{\Delta^{(1)}_t}. \qquad (6)$$

In words, $\mathcal{M}_y$ is the image of the restriction of $\mathcal{N}$ to the random set $(0, \sigma_y] \times \mathbb{R}_+^2$ by the
projection $(s, x_1, x_2) \to x_1$.
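The construction can be simulated exactly in the simple situation where the intensity is a finite measure. This is an assumption of the sketch below, not of the text: the intensity is taken to be `rate` times a probability law sampled by a user-supplied function, and the product of `rate` with the mean of the second coordinate should not exceed 1 so that the passage time is finite. The names `sample_M` and `draw_atom` are hypothetical.

```python
import random

def sample_M(y, rate, draw_atom, rng, max_atoms=10**6):
    """Sample (atoms of M_y, sigma_y) when the intensity is rate * law(draw_atom)."""
    t = 0.0       # current time
    level = 0.0   # current value of t - Y_t
    atoms = []
    for _ in range(max_atoms):
        gap = rng.expovariate(rate)      # waiting time until the next Poisson atom
        if level + gap >= y:             # t - Y_t climbs to y before that atom
            return atoms, t + (y - level)
        t += gap
        level += gap                     # t - Y_t grows at unit speed between atoms
        x1, x2 = draw_atom(rng)
        if x1 > 0:
            atoms.append(x1)             # keep the first coordinate only
        level -= x2                      # Y jumps upwards by x2
    raise RuntimeError("passage time not reached within max_atoms atoms")

rng = random.Random(3)
# Diagonal example: both coordinates equal, mean 0.25, so rate * mean = 0.25 <= 1.
atoms, sigma = sample_M(2.0, 1.0, lambda g: (lambda u: (u, u))(0.5 * g.random()), rng)
print(len(atoms), sigma)
```

In the diagonal example the passage time equals y plus the sum of the recorded atoms, since the process t - Y_t has to climb back every jump of Y before reaching level y.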

In the special case when the intensity measure $\Lambda$ is carried by the diagonal $\{(x, x) : x > 0\}$,
i.e. when $\Lambda(dx_1\,dx_2) = \delta_{x_1}(dx_2)\,\Lambda^{(1)}(dx_1)$, we have $\Delta^{(1)}_s = \Delta^{(2)}_s$ a.s. and the random point
measure $\mathcal{M}_y$ coincides with the empirical measure of the sizes of the jumps performed by the
subordinator Y during the time-interval $(0, \sigma_y]$. This case has also a natural interpretation in
terms of continuous state branching processes in discrete time (see Jiřina [14]). More precisely,
the well-known correspondence between CSBP in discrete time and subordinators enables us
to think of $\mathcal{M}_y$ as the empirical measure of the sizes of siblings in a CSBP in discrete time with
reproduction intensity $\Lambda^{(1)}$ and started from an initial population of size y.

We now observe that the property of independence and stationarity of the increments for
the process of point measures $(\mathcal{N}^{(1)}_t : t \ge 0)$ is preserved after the time-substitution by $\sigma_y$.
This claim is essentially a variation of the well-known fact that the first passage process of a
real-valued Lévy process with no negative jumps is a subordinator; see e.g. Theorem VII.1 in
[3].

Lemma 4 The measure-valued process $(\mathcal{M}_y : y \ge 0)$ has independent and stationary
increments.

Proof: Assume for a while that $\int_{(0,\infty)} (1 \wedge x)\,\Lambda^{(1)}(dx) < \infty$, which enables us to construct

$$T_t = \langle \mathcal{N}^{(1)}_t, \mathrm{Id} \rangle = \int_{[0,t] \times \mathbb{R}_+^2} x_1\,\mathcal{N}(ds\,dx_1\,dx_2) = \sum_{0 < s \le t} \Delta^{(1)}_s, \qquad t \ge 0.$$

Plainly (T, Y) is a pure jump Lévy process, more precisely it is a bivariate subordinator with
no drift. Further, the Poisson measure $\mathcal{N}$ can be recovered from the jump process of (T, Y).
The strong Markov property for Lévy processes shows that for every $y \ge 0$, the shifted process

$$(T', Y')_t = (T, Y)_{\sigma_y + t} - (T, Y)_{\sigma_y}, \qquad t \ge 0$$

is independent of $((T, Y)_t : t \le \sigma_y)$ and has the same law as (T, Y). As $\sigma_{y+y'} - \sigma_y$ coincides
with the first passage time of the process $t \mapsto t - Y'_t$ at level $y'$, this establishes our claim.

Finally, the assumption that $\int_{(0,\infty)} (1 \wedge x)\,\Lambda^{(1)}(dx) < \infty$ can be removed by considering the
image of $\mathcal{N}$ by a mapping $(t, x_1, x_2) \to (t, \phi(x_1), x_2)$ for some appropriate bijective map $\phi$ (recall
that the measure $\Lambda^{(1)}$ is sigma-finite). □

We next point at an interesting connexion between the distributions of $\mathcal{N}^{(1)}_t$ and $\mathcal{M}_y$ which
is an avatar of the classical ballot theorem (compare with Corollary VII.3 in [3]).

Proposition 1 There is the identity

$$\mathbb{P}(\mathcal{M}_y \in A,\, \sigma_y \in dt)\,dy = \frac{y}{t}\,\mathbb{P}\big(\mathcal{N}^{(1)}_t \in A,\, t - Y_t \in dy\big)\,dt, \qquad t > y > 0,$$

where A denotes an arbitrary measurable subset of point measures on $(0, \infty)$.

Proof: Introduce the random set

$$\mathcal{R} = \Big\{t : t - Y_t = \max_{0 \le s \le t} (s - Y_s)\Big\}.$$

The cyclic exchangeability property of the point measure $\mathcal{N}$ enables us to use a variation of
the well-known combinatorial argument for the ballot theorem (see Takács [20]) and get

$$t\,\mathbb{E}\big(g(t - Y_t),\, \mathcal{N}^{(1)}_t \in A \text{ and } t \in \mathcal{R}\big) = \mathbb{E}\big(g(t - Y_t)(t - Y_t)^+,\, \mathcal{N}^{(1)}_t \in A\big),$$

where $g : \mathbb{R} \to [0, \infty)$ stands for a generic measurable function. This easily yields the claim. □

Remark. In the special case when the intensity measure $\Lambda$ is carried by the diagonal, we have
$Y_t = \langle \mathcal{N}^{(1)}_t, \mathrm{Id} \rangle$ a.s., and Proposition 1 shows that the distribution of $\mathcal{M}_y$ is essentially a mixture
of laws of Poisson-Kingman partitions as defined by Pitman [17]. More precisely, suppose for
simplicity that for every $t > 0$, the infinitely divisible variable $Y_t$ has an absolutely continuous
law with a continuous density, say $p_t(\cdot)$. It then follows from Proposition 1 that

$$\mathbb{P}(\mathcal{M}_y \in \cdot) = y \int_y^{\infty} \frac{1}{t}\,\mathbb{P}\big(\mathcal{N}^{(1)}_t \in \cdot \mid \langle \mathcal{N}^{(1)}_t, \mathrm{Id} \rangle = t - y\big)\,p_t(t - y)\,dt,$$

where $\mathcal{N}^{(1)}_t$ is a Poisson random measure with intensity $t\Lambda^{(1)}$. Up to a normalization, the
conditional Poisson measures $\mathbb{P}(\mathcal{N}^{(1)}_t \in \cdot \mid \langle \mathcal{N}^{(1)}_t, \mathrm{Id} \rangle = a)$ which appear in the integral above
belong to the family of Poisson-Kingman partitions studied in depth by Pitman [17].

Next, for every Borel function $f : (0, \infty) \to \mathbb{R}_+$ with compact support, we define the
cumulant $\kappa(f) \ge 0$ by

$$\mathbb{E}\big(\exp(-\langle \mathcal{M}_1, f \rangle)\big) = \exp(-\kappa(f)),$$

with the usual notation

$$\langle \mathcal{M}_1, f \rangle = \sum_{0 < t \le \sigma_1} f\big(\Delta^{(1)}_t\big).$$

Observe from Lemma 4 that for an arbitrary $y \ge 0$ we have more generally

$$\mathbb{E}\big(\exp(-\langle \mathcal{M}_y, f \rangle)\big) = \exp(-y\kappa(f)). \qquad (7)$$

It is well-known that the cumulant $\kappa$ determines the law of $\mathcal{M}_1$, in the sense that any random
measure on $(0, \infty)$ having the same cumulant as $\mathcal{M}_1$ is distributed as $\mathcal{M}_1$; see for instance
Lemma 12.1 in Kallenberg [15]. We may now state the following basic result which provides
the characteristic equation solved by the cumulant:

Theorem 1 For every Borel function $f : [0, \infty) \to \mathbb{R}_+$ with compact support in $(0, \infty)$, the
equation

$$\lambda = \int_{\mathbb{R}_+^2} \big(1 - \exp(-f(x_1) - \lambda x_2)\big)\,\Lambda(dx_1\,dx_2)$$

has a unique solution in $[0, \infty)$ which is given by $\lambda = \kappa(f)$.

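Proof: For any random time $R \ge 1$, we see from elementary properties of Poisson random
measures that $\mathcal{N}^{(1)}_R = \mathcal{N}^{(1)}_1 + \mathcal{N}^{\prime(1)}_{R-1}$, where $(\mathcal{N}^{\prime(1)}_t : t \ge 0)$ is a process of point measures which is
independent of the restriction of $\mathcal{N}$ to $[0, 1] \times \mathbb{R}_+^2$ and has the same distribution as $(\mathcal{N}^{(1)}_t : t \ge 0)$.
We then note that the first passage time $\sigma_1$ is bounded from below by 1, and more precisely
there is the identity

$$\sigma_1 = 1 + \sigma'(Y_1),$$

in the obvious notation, viz.

$$\sigma'(y) = \inf\{t \ge 0 : t - Y'_t = y\} \quad \text{and} \quad Y'_t = \int_{[0,t] \times \mathbb{R}_+^2} x_2\,\mathcal{N}'(ds\,dx_1\,dx_2).$$

Applying the preceding observation, we thus have

$$\mathcal{M}_1 = \mathcal{N}^{(1)}_{\sigma_1} = \mathcal{N}^{(1)}_1 + \mathcal{N}^{\prime(1)}_{\sigma'(Y_1)}, \qquad (8)$$

so we can deduce from (7) that

$$\exp(-\kappa(f)) = \mathbb{E}\Big(\exp\big(-(\langle \mathcal{N}^{(1)}_1, f \rangle + Y_1\kappa(f))\big)\Big).$$

From the very definitions of $\mathcal{N}^{(1)}_1$ and $Y_1$, we can re-write the preceding identity as

$$\begin{aligned}
\exp(-\kappa(f)) &= \mathbb{E}\Big(\exp\Big(-\sum_{0 < t \le 1} \big(f(\Delta^{(1)}_t) + \kappa(f)\Delta^{(2)}_t\big)\Big)\Big) \\
&= \exp\Big(-\int_{\mathbb{R}_+^2} \big(1 - \exp(-f(x_1) - \kappa(f)x_2)\big)\,\Lambda(dx_1\,dx_2)\Big),
\end{aligned}$$

where the last line is Campbell's identity. Thus $\kappa(f)$ solves the equation of the statement.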
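Uniqueness is now easy. Indeed the map

$$F : \lambda \mapsto \lambda - \int_{\mathbb{R}_+^2} \big(1 - \exp(-f(x_1) - \lambda x_2)\big)\,\Lambda(dx_1\,dx_2)$$

has derivative

$$F'(\lambda) = 1 - \int_{\mathbb{R}_+^2} x_2 \exp(-f(x_1) - \lambda x_2)\,\Lambda(dx_1\,dx_2)$$

which is positive due to (5). □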
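The characteristic equation lends itself to a simple numerical treatment when the intensity is a finite combination of point masses. This discreteness is an assumption of the sketch below, made only for illustration, and the name `solve_cumulant` is hypothetical. Starting from 0 and iterating the right-hand side produces a non-decreasing sequence, which converges to the solution whenever the weighted sum of the second coordinates is at most 1, in line with (5).

```python
import math

def solve_cumulant(atoms, f, tol=1e-13, max_iter=100000):
    """Solve lam = sum_i w_i * (1 - exp(-f(x1_i) - lam * x2_i)) by iteration from 0.

    `atoms` is a list of triples (x1, x2, w): a point mass of weight w at (x1, x2).
    The function f is expected to vanish at 0, as it has compact support in (0, inf).
    """
    def rhs(lam):
        return sum(w * (1.0 - math.exp(-f(x1) - lam * x2)) for x1, x2, w in atoms)

    lam = 0.0
    for _ in range(max_iter):
        new = rhs(lam)
        if abs(new - lam) <= tol:
            return new
        lam = new
    raise RuntimeError("fixed-point iteration did not converge")

# One point mass of weight 1 at (1.0, 0.5), tested against f(x) = x.
lam = solve_cumulant([(1.0, 0.5, 1.0)], lambda x: x)
print(lam, 1.0 - math.exp(-1.0 - 0.5 * lam))
```

Plain iteration is chosen here because the right-hand side is increasing and concave in the unknown; it may converge slowly when the weighted sum of the second coordinates is close to 1, in which case a Newton step based on the derivative displayed above would be a natural alternative.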

We now conclude this section by discussing a simple example. Suppose that $\Lambda$ has support
on the axes, i.e.

$$\Lambda(dx_1\,dx_2) = \Lambda^{(1)}(dx_1)\,\delta_0(dx_2) + \delta_0(dx_1)\,\Lambda^{(2)}(dx_2).$$

Then the equation in Theorem 1 can be re-written as

$$\int_{(0,\infty)} \big(1 - e^{-f(x_1)}\big)\,\Lambda^{(1)}(dx_1) = \lambda - \int_{(0,\infty)} \big(1 - e^{-\lambda x_2}\big)\,\Lambda^{(2)}(dx_2).$$

On the other hand, our assumption implies that the subordinator Y and the process of point
measures $\mathcal{N}^{(1)}$ are independent, and $\mathcal{M}_1 = \mathcal{N}^{(1)}_{\sigma_1}$ is thus a mixed Poisson measure with intensity
$t\Lambda^{(1)}$ and mixing law $\mathbb{P}(\sigma_1 \in dt)$. In particular we have

$$\mathbb{E}\big(\exp(-\langle \mathcal{M}_1, f \rangle)\big) = \int_{(0,\infty)} \exp\Big(-t \int_{(0,\infty)} \big(1 - e^{-f(x_1)}\big)\,\Lambda^{(1)}(dx_1)\Big)\,\mathbb{P}(\sigma_1 \in dt).$$

Now recall from Theorem VII.1 in [3] that the Laplace transform of the first passage time $\sigma_1$
of the Lévy process with no positive jumps $t - Y_t$ is given by

$$\mathbb{E}\big(e^{-q\sigma_1}\big) = \exp(-\varphi(q)), \qquad q \ge 0,$$

where the cumulant $\varphi$ is the unique solution to

$$q = \varphi(q) - \int_{(0,\infty)} \big(1 - e^{-\varphi(q)x_2}\big)\,\Lambda^{(2)}(dx_2).$$

We conclude that

$$\kappa(f) = \varphi\Big(\int_{(0,\infty)} \big(1 - e^{-f(x_1)}\big)\,\Lambda^{(1)}(dx_1)\Big),$$

which is thus in agreement with Theorem 1.
5 Limit laws for partitions into colonies



In this section, we state and prove the main limit theorem for distributions of partitions into
colonies. We consider for each fixed integer n a Galton-Watson process with emigration started
from $a(n)$ ancestors that all occupy different sites, such that the number of homebody children
and the number of migrant children $(\xi^{h}_{n,k}, \xi^{m}_{n,k})$ of the k-th individual are given by an i.i.d.
sequence. We write $\mathcal{P}_n$ for the partition into colonies induced by this model.

We also consider a bivariate Lévy process $(X^{h}, X^{m})$ such that $X^{h}$ has no negative jumps and
infinite variation, $X^{m}$ is a subordinator, and the sum $X = X^{h} + X^{m}$ does not drift to $+\infty$. We
write $\Lambda$ for the Lévy measure that arises in Lemma 3 and then $(\mathcal{M}_y : y \ge 0)$ for the process
of random point measures which has been studied in Section 4 for this specific choice of $\Lambda$.

We are now able to state the main result of this work. 

Theorem 2 Write $\widetilde{\mathcal{P}}_n$ for the image of the partition into colonies $\mathcal{P}_n$ by the rescaling
$x \to x/a(n)$, viz.

$$\langle \widetilde{\mathcal{P}}_n, f \rangle = \langle \mathcal{P}_n, f_n \rangle$$

where $f_n(x) = f(x/a(n))$. Assume that the number of ancestors $a(n)$ fulfills $a(n) \sim an$ for
some $a > 0$. Then in the regime (3), $\widetilde{\mathcal{P}}_n$ converges weakly on the space of sigma-finite measures
on $(0, \infty)$ as $n \to \infty$ towards $\mathcal{M}_a$.

The material developed so far suggests that the proof of Theorem 2 should consist of two
steps, namely first a tightness property for the rescaled partitions into colonies, and then
uniqueness of the limit of a subsequence, which shall be derived by the analysis of cumulants.
This is indeed the route that we will follow.

Lemma 5 In the regime (3), the sequence of the distributions of the variables $\langle \widetilde{\mathcal{P}}_n, \mathrm{Id}\rangle$, for
$n \in \mathbb{N}$, is tight on the space of sigma-finite measures on $(0, \infty)$.

Proof: Indeed, recall that

$$\langle \widetilde{\mathcal{P}}_n, \mathrm{Id}\rangle = a(n)^{-1}\langle \mathcal{P}_n, \mathrm{Id}\rangle = \zeta_n/a(n)$$

is simply the size $\zeta_n$ of the total population generated by the Galton-Watson process
$Z^{(n)}$, renormalized by the factor $1/a(n)$. It is well-known that in the regime (3), this quantity
converges in distribution as $n \to \infty$ towards the size of the total population of the CSBP $Z$,
that is, equivalently, the first passage time of the Lévy process $X$ at level $-a$. Hence, the
sequence in the statement is tight. □

For the second step of the proof of Theorem 2, we write $K_n$ for the cumulant of the rescaled
random measure $\widetilde{\mathcal{P}}_n$, and fix a Borel function $f : (0, \infty) \to \mathbb{R}_+$ with compact support. For the
sake of simplicity, we will suppose in the sequel that $a = 1$, i.e. that $a(n) \sim n$, which induces
no loss of generality thanks to the branching property and (7).

Lemma 6 Let $f : [0, \infty) \to \mathbb{R}_+$ be an arbitrary continuous function with compact support in
$(0, \infty)$. Any limit point $\lambda$ of the sequence $(K_n(f) : n \in \mathbb{N})$ fulfills

$$\lambda = \int_{\mathbb{R}_+^2} (1 - \exp(-f(x_1) - \lambda x_2))\,\Lambda(dx_1\,dx_2)\,.$$

Proof: Tracing back the definitions, we get

$$\exp(-K_n(f)) = \exp(-a(n)\,\mathcal{K}_n(f_n)) = \left(1 - \mathbb{E}^{(n)}\left(1 - \exp(-f_n(C) - \mathcal{K}_n(f_n)M)\right)\right)^{a(n)},$$

where we used Lemma 1 for the last equality, and the notation $\mathbb{E}^{(n)}$ refers to the mathematical
expectation corresponding to the $n$-th population model. Then recall from Lemma 2 that

$$\mathbb{E}^{(n)}\left(1 - \exp(-f_n(C) - \mathcal{K}_n(f_n)M)\right)
= \mathbb{E}^{(n)}\left(1 - \exp\left(-f(C/a(n)) - \frac{K_n(f)}{a(n)}\,M\right)\right)$$

$$= \mathbb{E}\left(1 - \exp\left(-f\bigl(a(n)^{-1}T^{(n)}_1\bigr) - \frac{n}{a(n)}\,K_n(f)\,n^{-1}S^{(n,m)}_{T^{(n)}_1}\right)\right).$$

Recall also that $n/a(n) \to 1$ and $K_n(f) \to \lambda$. We then apply Corollary 1 with the function
$(x_1, x_2) \mapsto 1 - \exp(-f(x_1) - \lambda x_2)$ and derive the equation of the statement. □

The proof of Theorem 2 should now be plain. It follows from Lemma 5 that the sequence
of the laws of rescaled partitions into colonies $\widetilde{\mathcal{P}}_n$ is tight in the space of sigma-finite measures
on $(0, \infty)$. We then deduce from Prohorov's lemma (see, e.g., Lemma 16.15 in Kallenberg
[15]) that the sequence of the distributions of the random measures $\widetilde{\mathcal{P}}_n$ on $(0, \infty)$ is relatively
compact. If $\mathcal{P}$ has the law of the limit of some subsequence, then we deduce from Lemma 6
that for an arbitrary continuous function $f : [0, \infty) \to \mathbb{R}_+$ with compact support in $(0, \infty)$, the
cumulant

$$K(f) = -\ln \mathbb{E}(\exp -\langle \mathcal{P}, f\rangle)$$

solves

$$K(f) = \int_{\mathbb{R}_+^2} \left(1 - \exp(-f(x_1) - K(f)x_2)\right) \Lambda(dx_1\,dx_2)\,.$$

We conclude from Theorem 1 that $K(f) = \kappa(f)$, and thus $\mathcal{P}$ has the same distribution as $\mathcal{M}_1$.
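
To get a feeling for the integral equation that characterizes the cumulant, here is a minimal numerical sketch. It assumes a purely hypothetical finite measure $\Lambda$ given by finitely many atoms `(x1, x2, weight)`; the function name `cumulant` and this toy measure are illustrative choices, not objects from the text. Since the right-hand side is concave in the unknown and bounded by the total mass, bisection is enough.

```python
import math

def cumulant(f, atoms, tol=1e-12):
    """Solve k = sum_i w_i * (1 - exp(-f(x1_i) - k * x2_i)) for k >= 0 by bisection.

    `atoms` is a hypothetical finite measure: a list of (x1, x2, weight) triples,
    standing in for Lambda. `f` must be nonnegative.
    """
    def g(k):
        # g(k) = k - (right-hand side); convex in k, with g(0) <= 0.
        return k - sum(w * (1.0 - math.exp(-f(x1) - k * x2)) for x1, x2, w in atoms)

    # The right-hand side never exceeds the total mass, so g(hi) >= 1 > 0.
    lo, hi = 0.0, sum(w for _, _, w in atoms) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, `cumulant(lambda x: x, [(1.0, 0.5, 1.0), (2.0, 0.0, 0.5)])` returns the positive root for a two-atom measure. For a genuine Lévy measure with infinite mass near the origin, the sum would have to be replaced by a numerical integral and the upper bracket chosen differently.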

Remark. It may be interesting to point at a different route for establishing Theorem 2, which
uses the representation (1) of the partition into colonies. Recall the notation there and the
convergence in distribution in the regime (3). Invoking Theorem 16.14 in [15], it is easy to check that the
latter can be reinforced into weak convergence of càdlàg processes in the sense of Skorohod.
One can then deduce from a time-substitution that

$$\left(a(n)^{-1}T^{(n)}_{\lfloor nx\rfloor},\; n^{-1}S^{(n,m)}_{T^{(n)}_{\lfloor nx\rfloor}}\right) \Longrightarrow (T_x, Y_x)\,,$$

where again the convergence holds in the sense of Skorohod. Loosely speaking, this entails the
weak convergence of the increments of the random walk $T^{(n)}$ rescaled by a factor $1/a(n)$ to the
jump-process of the subordinator $T$. We know from the Lévy-Itô decomposition that the latter
can be described as a Poisson random measure whose intensity is expressed in terms of the
Lévy measure of $T$. It remains to recall that in this setting, the number $\gamma_n$ of colonies fulfills

$$\gamma_n = \min\left\{k : S^{(n,m)}_{T^{(n)}_k} - k = -a(n)\right\}$$

and to check that

$$n^{-1}\gamma_n \Longrightarrow \sigma_a = \inf\{t > 0 : t - Y_t = a\}\,.$$

Some technical details needed to justify rigorously this approach may be tedious; they are
circumvented here by the appeal to the characterization of the cumulant of the random measure
$\mathcal{M}_a$ in Theorem 1 and the simple argument for tightness in Lemma 5.

6 Examples 

In this section, we shall illustrate our main results for partitions into colonies by discussing
some natural examples. Their common feature is that the distribution of the total number of
children (homebodies and migrants) of a typical individual is fixed; that is, the Galton-Watson
process for which spatial locations of individuals are discarded has a fixed reproduction law.
The differences in the models thus only appear through the split between homebody and
migrant children. One could of course deal with much more general examples; however, the
present ones already exhibit a rich variety of asymptotic behaviors.

Recall (2). Throughout this section, we consider an integer-valued random variable $\xi$ with
unit mean, which belongs to the domain of attraction of a (completely asymmetric) stable
variable with index $\beta \in (1, 2]$. That is, there is a sequence $(a(n) : n \in \mathbb{N})$ which varies regularly
with index $\beta$ such that

$$n^{-1}\left(\xi_1 + \cdots + \xi_{a(n)} - a(n)\right) \Longrightarrow X_1\,,$$

where $(\xi_i : i \in \mathbb{N})$ is a sequence of i.i.d. copies of $\xi$ and now $(X_t : t \geq 0)$ is a stable($\beta$) Lévy
process with no negative jumps. In other words, there is some $b > 0$ such that

$$\mathbb{E}(\exp(-qX_t)) = \exp(t\psi(q))\,, \qquad q > 0\,,$$

i.e. $\psi(q) = bq^{\beta}$. The (continuous version of the) density of the variable $X_1$ will be denoted by
$p$,

$$\mathbb{P}(X_1 \in dx) = p(x)\,dx\,,$$

so that, by scaling,

$$\mathbb{P}(X_t \in dx) = t^{-1/\beta}p(t^{-1/\beta}x)\,dx$$

for every $t > 0$.

6.1 Allelic partitions for rare neutral mutations 

We first deal with the classical model corresponding to neutral mutations. That is, for each fixed
integer $n$, the total number $\xi_k$ of children of the $k$-th individual is decomposed as $\xi_k = \xi^{h}_{n,k} + \xi^{m}_{n,k}$,
where conditionally on $\xi_k = \ell$, the variable $\xi^{m}_{n,k}$ has the binomial distribution with parameter
$(\ell, p(n))$ for some $p(n) \in (0, 1)$. In other words, we assume that each child chooses to become a
migrant with probability $p(n)$, independently of the other individuals.

If we now suppose that

$$p(n) \sim cn/a(n)$$

for some constant $c > 0$, so that the mutation rate is small when $n$ is large, then clearly (3)
holds with $X^h_t = X_t - ct$ and $X^m_t = ct$, and thus

$$\Psi(q, r) = bq^{\beta} + cq - cr\,, \qquad q, r \geq 0\,.$$

We now see from Lemma 3(ii) that the bivariate subordinator $(T_x, Y_x)$ has Laplace exponent
$\Phi(q, r) = \varphi(q + cr)$ where $\varphi(q) = z$ is given by the nonnegative solution to

$$bz^{\beta} + cz = q\,.$$

We stress that $Y_x = cT_x$ a.s., and that the subordinator $(T_x : x \geq 0)$ has Laplace exponent $\varphi$.
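
Since $\varphi$ is only defined implicitly as the nonnegative root of $bz^{\beta} + cz = q$, a small numerical sketch may help. The function names `phi` and `Phi` below are illustrative and not from the text. The left-hand side is increasing in $z \geq 0$, so plain bisection suffices.

```python
import math

def phi(q, b, c, beta, tol=1e-12):
    """Nonnegative solution z of b*z**beta + c*z = q, found by bisection."""
    if q < 0:
        raise ValueError("q must be nonnegative")
    lo, hi = 0.0, 1.0
    while b * hi**beta + c * hi < q:   # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if b * mid**beta + c * mid < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def Phi(q, r, b, c, beta):
    """Bivariate exponent Phi(q, r) = phi(q + c*r) of (T, Y) when Y = c*T."""
    return phi(q + c * r, b, c, beta)
```

In the case $\beta = 2$ the root is explicit, $\varphi(q) = (\sqrt{c^2 + 4bq} - c)/(2b)$, which gives a convenient sanity check for the bisection.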

An easy consequence of a version of the Ballot Theorem (more precisely, cf. Corollary VII.3
in [3]) is that the Lévy measure $\Lambda^{(1)}$ of the subordinator of the first passage times of $X^h$ can be
expressed in terms of the density of $X^h$ at 0. More precisely, one gets

$$\Lambda^{(1)}(dt) = \lim_{x \to 0+} x^{-1}\,\mathbb{P}(T_x \in dt) = t^{-1-1/\beta}\,p(ct^{1-1/\beta})\,dt\,,$$

where the last equality follows from the fact that

$$\mathbb{P}(X^h_t \in dx) = \mathbb{P}(X_t \in ct + dx) = t^{-1/\beta}p(t^{-1/\beta}(x + ct))\,dx\,.$$

Recall that $Y = cT$; it follows that the Lévy measure of the bivariate subordinator $(T, Y)$ that
determines the law of the limiting partition $\mathcal{M}_a$ is then given by

$$\Lambda(dx_1, dx_2) = x_1^{-1-1/\beta}\,p(cx_1^{1-1/\beta})\,\delta_{cx_1}(dx_2)\,dx_1\,.$$

On the other hand, recall again from Corollary VII.3 in [3] that

$$\frac{\mathbb{P}(T_x \in dt)}{dt} = \frac{x}{t}\,\frac{\mathbb{P}(-X^h_t \in dx)}{dx} = x\,t^{-1-1/\beta}\,p(t^{-1/\beta}(ct - x))\,, \qquad t, x > 0\,.$$



We can combine this identity with the argument in the Remark following Proposition 1 to
express the distribution of $\mathcal{M}_a$ as a mixture of laws of Poisson measures conditioned on their
first moments (i.e. Poisson-Kingman partitions, see [17]). More precisely, we have

$$\mathbb{P}(\mathcal{M}_a \in \cdot) = a \int_{(a,\infty)} \mathbb{P}\left(N^{(1)}_t \in \cdot \mid t - Y_t = a\right) \frac{\mathbb{P}(t - Y_t \in da)}{da}\,\frac{dt}{t}\,,$$

where $N^{(1)}_t$ is a Poisson random measure with intensity $t\Lambda^{(1)}$. We now get, using the identity
$Y = cT$, that

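$$\mathbb{P}(\mathcal{M}_a \in \cdot) = a \int_{(a,\infty)} \mathbb{P}\left(N^{(1)}_t \in \cdot \mid T_t = (t - a)/c\right) p\left(t^{-1/\beta}(ct - (t - a)/c)\right) \frac{t - a}{c}\,t^{-2-1/\beta}\,dt\,. \qquad (9)$$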

In the important case $\beta = 2$, which occurs whenever the reproduction law has finite variance,
$p$ is simply the Gaussian density and

$$\Lambda^{(1)}(dt) = \frac{1}{\sqrt{4\pi b t^3}} \exp\left(-\frac{c^2 t}{4b}\right) dt\,.$$

That is, $\Lambda^{(1)}$ is the Lévy measure of an inverse Gaussian subordinator, which is merely an
exponential transform of the stable(1/2) Lévy measure. Recall that the exponential transform
plays no role for the distribution of Poisson measures conditioned on their first moments,
which thus reduces the description (9) of the law of $\mathcal{M}_a$ to the more usual stable(1/2) Lévy
measure. This situation has been investigated in depth by Pitman, who has obtained a number
of formulas for distributions related to such Poisson-Kingman partitions; see Section 8 in [17].
In particular Pitman has established sampling formulas in terms of Hermite functions which
provide extensions of the celebrated one due to Ewens [5].
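
The case $\beta = 2$ lends itself to a numerical sanity check. The sketch below assumes the normalization that follows from $\psi(q) = bq^2$, namely that $X_1$ is centered Gaussian with variance $2b$, so that $p(x) = (4\pi b)^{-1/2}\exp(-x^2/(4b))$. It then verifies that integrating $1 - e^{-qt}$ against $t^{-3/2}p(c\sqrt{t})\,dt$ returns the explicit root $(\sqrt{c^2 + 4bq} - c)/(2b)$ of $bz^2 + cz = q$. All function names, the truncation `upper` and the number of `steps` are illustrative choices.

```python
import math

def gaussian_density(x, b):
    """Density p of X_1 when beta = 2, i.e. X_1 ~ N(0, 2b) since psi(q) = b*q**2."""
    return math.exp(-x * x / (4.0 * b)) / math.sqrt(4.0 * math.pi * b)

def levy_density(t, b, c):
    """Density of Lambda^(1): t**(-1 - 1/beta) * p(c * t**(1 - 1/beta)) with beta = 2."""
    return t ** -1.5 * gaussian_density(c * math.sqrt(t), b)

def laplace_exponent(q, b, c, upper=12.0, steps=20000):
    """Midpoint rule for int (1 - exp(-q t)) Lambda^(1)(dt), after substituting t = u**2.

    The substitution removes the t**(-1/2) singularity of the integrand at 0.
    The integral in u is truncated at `upper`, which is adequate when
    c**2 * upper**2 / (4*b) is large.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        t = u * u
        total += -math.expm1(-q * t) * levy_density(t, b, c) * 2.0 * u * h
    return total
```

With `b, c, q = 0.5, 1.0, 1.0` the closed form is $\sqrt{3} - 1$, and `laplace_exponent(1.0, 0.5, 1.0)` should agree with it up to the quadrature error.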

6.2 One-type siblings 

We consider now an example related to the fragmentation process at nodes of the stable tree
which has been considered by Miermont [16]; see also [1]. Specifically, we suppose henceforth
that $\beta < 2$, and for each fixed value of the parameter $n \in \mathbb{N}$, all the children of an individual
are homebodies with a probability that decays exponentially in the size of the sibship, and all
children are migrants otherwise. More precisely, conditionally on $\xi_k = \ell$, the event $\xi^{h}_{n,k} = \ell$
and $\xi^{m}_{n,k} = 0$ occurs with probability $e^{-\ell/n}$, while the event $\xi^{h}_{n,k} = 0$ and $\xi^{m}_{n,k} = \ell$ occurs with
probability $1 - e^{-\ell/n}$.

In this situation, it is easy to check that (3) holds with

$$X^m_t = \sum_{0 < s \leq t} \mathbf{1}_{\{\Delta X_s > e_s\}}\,\Delta X_s \quad \text{and} \quad X^h_t = X_t - X^m_t\,,$$

where $\Delta X_s$ stands for the size of the jump (if any) of the stable($\beta$) process $X$ at time $s$ and $e_s$ for
an independent standard exponential mark which is attached to each jump of $X$. Well-known
properties of Lévy processes entail that the processes $X^h$ and $X^m$ are independent and that $X^h$
is an (Esscher) exponential transform of $X$. More precisely, the bivariate Laplace exponent of
$(X^h, X^m)$ is then given by

$$\Psi(q, r) = b\left((q + 1)^{\beta} - 1\right) + b\left(r^{\beta} + 1 - (r + 1)^{\beta}\right).$$
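
As a quick consistency check (our own computation, not part of the original argument), recall that $X^h + X^m = X$. Setting $r = q$ in the bivariate exponent $\Psi(q, r) = b((q+1)^{\beta} - 1) + b(r^{\beta} + 1 - (r+1)^{\beta})$ should therefore give back the exponent $\psi(q) = bq^{\beta}$ of the stable($\beta$) process, and the terms $(q+1)^{\beta}$ indeed cancel:

```latex
\begin{align*}
\Psi(q,q) &= b\bigl((q+1)^{\beta} - 1\bigr) + b\bigl(q^{\beta} + 1 - (q+1)^{\beta}\bigr) \\
          &= b\,q^{\beta} \;=\; \psi(q).
\end{align*}
```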

Since the law of $X^h_t$ is simply an exponential transform of that of $X_t$,

$$\mathbb{P}(X^h_t \in dx) = t^{-1/\beta}\,e^{-x - bt}\,p(t^{-1/\beta}x)\,dx\,,$$

and we deduce from the Ballot Theorem that

$$\Lambda^{(1)}(dt) = p(0)\,t^{-1-1/\beta}\,e^{-bt}\,dt\,.$$

As the first passage process $T$ is a subordinator which is independent of $X^m$, and $Y = X^m \circ T$
results from Bochner's subordination, we get that the Lévy measure $\Lambda$ of $(T, Y)$ can be
expressed in the form

$$\Lambda(dx_1, dx_2) = p(0)\,x_1^{-1-1/\beta}\,e^{-bx_1}\,dx_1\,\mathbb{P}(X^m_{x_1} \in dx_2)\,.$$



6.3 Migration forced by cut-off 

In the preceding two examples, the subordinator $X^m$ was either deterministic (a pure drift) or
independent of the Lévy process $X^h$. Our last example shows that more general situations may
arise. Specifically, we consider the parameter $n$ as a threshold and decide that at most $n$
of the children of each individual are homebodies and the rest are migrants. In other words,
$\xi^{h}_{n,k} = \xi_k \wedge n$ and $\xi^{m}_{n,k} = (\xi_k - n)^{+}$.
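
The three ways of splitting the total number of children of an individual into homebodies and migrants, as described in Sections 6.1 to 6.3, can be sketched side by side. This is only an illustration: the function names are hypothetical, `xi` stands for the total number of children, `p` for the migration probability $p(n)$, `n` for the scaling parameter, and `rng` for a `random.Random` instance.

```python
import math
import random

def split_neutral(xi, p, rng):
    """Section 6.1: each child migrates independently with probability p."""
    migrants = sum(1 for _ in range(xi) if rng.random() < p)
    return xi - migrants, migrants

def split_one_type(xi, n, rng):
    """Section 6.2: all children stay with probability exp(-xi/n), else all migrate."""
    if rng.random() < math.exp(-xi / n):
        return xi, 0
    return 0, xi

def split_cutoff(xi, n):
    """Section 6.3: at most n children stay, the remaining ones migrate."""
    return min(xi, n), max(xi - n, 0)
```

Each function returns a pair (homebodies, migrants) whose sum is `xi`. Only the cut-off rule is deterministic given `xi`, which reflects the fact that in this last example the two limiting processes cannot be independent.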

Then (3) is fulfilled with

$$X^m_t = \sum_{0 < s \leq t} \mathbf{1}_{\{\Delta X_s > 1\}}\,(\Delta X_s - 1) \quad \text{and} \quad X^h_t = X_t - X^m_t\,.$$

We stress that the jump times of $X^m$ are exactly the times when $X^h$ has a jump of size 1; in
particular the processes $X^h$ and $X^m$ are not independent.